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Q ■ Abstract. In mechanics of structures as well as in other domains of engineering, the 
^ ■ probabilistic models often have to be computed by simulation. One of the reasons is 
^ . that the stochastic calculus involved in non linear representations yields rarely explicit 
formulas. Usually these simulations use, for each sample, a large number of calls to the 
|Vj I random function, in this case the simulation by the shift, whose field of application is as 
P_( ■ wide as that of Monte Carlo method, is particularly relevant. 

^ ■ The theoretical features, the implementation and the specific advantages of this method 

. have been taught in a course of the author at Paris VI University in 1988 and are detailed 
in the book with D. Lepingle [BL]. 



I. Other known methods of simulation 

Let us begin by the presentation of the methods to be compared with the shift. 



> 

! A. The Monte Carlo method 

For computing the expectation 1E(X) of a random variable X, it consists of 



^ ■ a) representing X as a random variable defined on the probability space 

: {[0,lY,B[0,lY,dxidx2---dxs 
■ 

^ ! with s finite or infinite, 

^ I b) imitating independent random samples of points of [0, 1]' 



ra25 • • • 1 Uyis) ) 



([/n,f/l2,...,f/ls) , (^21,f/22,...,^2s) {UnuU, 

c) and applying the law of large numbers 

1 ^ 

IE(X) = hm - ^ [/„2, . . . , f/„.). 

Comments 

• The step a) is the art of using the algorithms of simulation. It is theoretically possible 
to take s = 1, but practically, the dimension s, finite or infinite, is rather naturally given 
and is difficult to reduce. The main reference for these algorithms is [Devroye 86]. It is 
worth to note that this operation of representing a random variable X on the cube [0, 1]'' 
(s finite or not) can be effectively performed in very much more cases than those where 
the law of X is known by an explicit formula. This is, in particular, due to the rejection 
method which allows an exact simulation of a random variable even when its density is 
only known by a sequence of approximations. 
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• The step b) is the generation of pseudo random numbers. The imitation of randomness 
cannot be perfect. It has been theoretically proved by logicists, especially by Martin 
Lof. Practically extremely good generators are available with gigantic period. They are 
obtained by testing the production algorithms through statistical tests and eliminating 
those who behave badly. The tests are quite numerous (see [Niederreiter 1978], Knuth 
1981, Marsagha 1985, Ripley 1987, Fushimi 1988, Altman 1988, Anderson 1990, and other 
references given in [BL]) 

• The class of functions to which the Monte Carlo method applies is theoretically all 
functions in C^{[0,lY,dx) (here dx is the product probability measure). Practically, it 
is safe to restrict the method to bounded functions. Nevertheless the irregularity of the 
function is not limited. 

• The speed of convergence is given for X in L^, hence for X bounded, by the law of 
iterated logarithm 

hmsup I (j2 X{Ur,i, t/„2, . . . , t/n.) - Ex) = 1 

N ay/2NloglogN / 

where is the variance of X. 

There are also global estimates of the type Cramer-Chernov {X bounded) 

1 ^ 

1P(^ E ^(^nl, C^n2, . . . , C/n.) > EX + e) < e^^^^) 
-'^ n=l 

which are in connexion with large deviation theory. 

B. Quasi-Monte Carlo methods 

The step a) is preserved, but b) and c) are replaced by 
b') choosing an equidistributed sequence (^n) on [0, 1]*, 
c') putting 

1 N 

Comments • This method requires the random variable to be represented on [0, 1]* by 

a Riemann integrable function i.e. a bounded function whose discontinuity points belong 
to a negligible set, or equivalently such that Ve > continuous functions u, v exist such 
that u < X < V, J{v — u)dx < e. 

Often the function obtained on [0, 1]* by the step a) is Riemann integrable already, so 
this fact brings no restriction. 

The non-Riemann integrable functions encountered in practice, come from stochastic 
calculus related to Brownian motion for which quasi-Monte Carlo methods are not yet 
relevant. 

• The speed of these methods using low discrepancy sequences [cf Niederreiter 1978] is 
impressive in low dimension. A great advantage is also that the rate of convergence is 
given by a deterministic criterion — the Koksma-Hlawka formula — allowing to pro- 
duce the numerical value of the computed expectation with an explicit accuracy (cf [BL] 
Chapter 2C). But this speed is decreasing when the dimension increases, so that for large 
dimensional simulations these methods are at present irrelevant (see [BL] for the study of 
the realm of efficiency of these methods) . 
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II. The shift method 



A. The infinite dimension in practice 

Because the shift method is particularly relevant in large or infinite dimension, it is 
important to emphasize the fact that the case of infinite dimension is practically the most 

frequent. 

Indeed, most often, in the simulation, the number of calls to the random function, 
although finite, is itself random and unbounded. A typical example is the computation 
of the expectation of a stopping time of a Markov chain, e.g. the entrance time in a given 
set. 

But even for a finite dimensional probabilistic model the step a) will lead to the infinite 
dimension if the rejection method is used, and practionners know how rejection is often 
unavoidable. 

B. The principle of the shift method 

It is based on the pointwise ergodic theorem instead of the law of large numbers. 
Let A be a random variable simulated under the form 

A = F([/i,C/2,. ..,[/„,...) 

where yhe t/j's are i.i.d. random variables uniformly distributed on [0, 1]. In other words 
the UiS are the coordinate mappings from [0,1]°° equipped with the product Lebesgue 
measure to each factor. The method consists of putting 

IE(A) = hm ^JF{U,, U2, ...) + F{U2, Us, ...) + ■■■ + F{Un, U^+u ■ ■ ■)] 

instead of using in the case of the Monte Carlo method a double sequence of calls: 

]E(A) = lim ^[F(Un, Uu, ■ ■ ■) + F(U2i, U22, ...) + ■■■ + F(Um, Um, ■ ■ ■)] 

The theorem of Birkhoff says that the shift method converges almost surely and in as 
soon as A is in and in as soon as A is in 1 < p < cxd. 

C. General implementation manner 

Suppose we have written a procedure able to give us a sample of the probabilistic 
model we are studying. 

Instead of writing the main program using this procedure successively N times and 
doing the average, we shall write the main program using N times the procedure after 
initialisation and shifting one step the random function each time. 

The economy on the pseudo-random numbers generator is evident. 

D. Fine implementation for efficiency 

In order to exploit the full strength of the method the maximum amount of information 
of the preceding sample has to be kept for the following one. 

This can be done very efficiently by pointers. Let us take the example of a Markov 
chain for the explanation. 

Let us consider a Markov chain A„ with values in JR'^ which is simulated on [0, 1]°° in 
the following way: 

^n+l — F{Xn, n, Un+l), Xq — X 
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where F is a map from IR"' X N X [0, 1] into JR'^, and where the UnS RrG, cis before, the 
coordinate mappings of the cube [0, 1]°° to its factors. 

Let us suppose we want to compute the expectation of a functional of the process X, 
for instance ]E[G{Xt,T)] where G is a given bounded function and T the hitting time in 
the set Ac'R'^: T = mi{n > 0; X„ G ^1}. 

Denoting 6 the shift operator on [0, 1]°° defined by 



UnOO 



the ergodic theorem writes 



n+l) 



nGiXr, T)] = hm - G{Xt, T) o Or 

The use of pointers can be explained as follows. In order to apply the preceding formula 
we have to computeG(Xr, T) on successive points uj,6{uj),6'^{ijj), . . . , in [0,1]°°. These 
points are sequences of points in [0,1] : 



([/iH,[/2H,...,[/fcH,...) 

([/2(a;),[/3H,...,[/,+iH,...) 



If, as we suppose, the stopping time T is finite, we need only to compute a finite length 
of each of these sequences: the sequence ui is computed until T{ijj) which is given by the 
test 

Xk{uj) still outside A ^ k < T{uj) 
Xk{u)) G A for the first time ^ k = T{uj) 



< test > 



Suppose we have picked out the sequence u = {Ui,U2, ■ ■ ■) and we have put the numbers 
f/i, f/2, . . . in pointers as in the following figure: 





<test>=False <test>=False 



The sequence 9{uj) = (f/2, f/3, . . .) is already partially chosen: either it is long enough or 
it must be lengthened, in which case we have the new scheme: 



4 






<test>=False 



In these figures the arrows represent computations to do, essentially to compute 

Xfi+l = F{Xn, n, Un+l) 

When we compute F{x, n, y) however, only the variable x is new (and n becomes n + 1) 
and partial computations depending only on y can be stored in the box ofUn- This storage 
must of course be made while we are lengthening the sequence as explained above. 

Clearly this storage of partial computations cannot be done in the Monte Carlo 
method. 



E. Theoretical results on the rate of convergence 

We dont go into the details of these results here, see [BL] for mathematical proofs. 

There is no standard rate of convergence valid for every function in L^. Nevertheless 
in practice usual functions belong to a class called the Gordin class in which a law of 
iterated logarithm holds. 

This law of iterated logarithm involves a coefficient different from the variance, which 
can be either bigger or smaller than the variance even in the finite dimensional case. So 
even for integration in finite dimension the shift can be faster than Monte Carlo. 

From a practical point of view the possibility of storage of partial computations is the 
dominant phenomenon. 

F. Numerical example 

The simulated device is a transport particles problem. A particle arrives in coming 
from the real negative axis and enters into the square [0,1] X where it goes m 

straight lines during a random distance which follows an exponential law with parameter 
A, then it splits into two similar directions uniformly distributed on [— |, |] with respect 
to the direction of the preceding particle, and they behave as the initial particle provided 
they are inside the square. The quantities to be computed are the mean number of 
splittings and the mean number of particles leaving the cube by the righthand side. 

The following results have been obtained by simulating 500 000 samples by the Monte 
Carlo method (MC) and the shift method (Sh) 
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parameter 


0.980 


0.960 


0.940 


0.920 


0.900 


method 


MC Sh 


MC Sh 


MC Sh 


MC Sh 


MC Sh 


mean number 
oi splittings 


2.78 2.79 


2.92 2.93 


3.12 3.11 


3.34 3.31 


3.60 3.58 


mean number of 
particles through 
the righthand side 


3.97 4.00 


4.36 4.38 


4.95 4.90 


5.68 5.62 


6.72 6.65 


mean number of 

calls to the 
random function 


24.5 4.9 


27.4 4.9 


32 4.9 


37 5 


46 5 


ratio of the 
duration of the 
programs MC/Sh 


2.04 


2.19 


2.31 


2.40 


2.46 



III. Recommendations 

To become an expert in the art of using the shift, it is essential to keep in mind that 
the rate of convergence depends on the order of the calls 

Let us take a finite dimensional example: Let / be a bounded function from [0, 1]"^ 
into IR. The integral of / can be obtained by the shift by putting 

E/ = lim i-[/(f/i, ...,[/,)+ /([/2, + ... + ([/^, [/^+,)] 

N^oo iV 

Now we can also permute the coordinates defining with the permutation a 

and apply the shift to fa- 

The rate of convergence will be different in general, so that there are d\ different 
manners of applying the shift. 

In finite dimension these different manners give roughly speaking the same order of 
rate (see [BL] for a detailed study) 

But in infinite dimension it is quite different and for simulating the expectation of a 
functional of a random process this becomes very important: The idea of discretization of 
the process and shifting along the time is one of the worse way. Good implementations 
for functionals of Brownian motion and for solution to stochastic differential equations 
are given in [BL Chapter V]. 
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